Description

A Muxed-Output Double-Data-Rate-2
(DDR2) Register with Fast Propagation 

Delay 

Cross Reference to Related Applications 

[0001] This application is a continuation-in-part of the co- 
pending application for "Data Register for Buffering Dou- 
ble-Data-Rate DRAMs with Reduced Data-Input-Path 
Power Consumption", U.S. Ser. No. 10/249,581, filed
4/21/03.
Background of Invention 

[0002] This invention relates to integrated circuits, and more 
particularly to differential buffer chips. 

[0003] Memory modules are widely used in electronic systems 
such as personal computers (PCs). Various standards are 
used, such as those by the Joint Electron Device Engi-
neering Council (JEDEC). Some JEDEC standards use dou-
ble-data-rate (DDR) dynamic-random-access memory
(DRAM) chips on modules known as dual-in-
line-memory-modules (DIMMs). A newer DDR-2 standard
is also being implemented. Differential input signals are 
used for faster signaling. 

[0004] Very high-speed buffer chips are needed for interfacing
with the DDR-2 DRAMs. Each data line, and perhaps some
address or control signals, are buffered. Bi-directional data
lines can be supported by using two uni-directional data- 
buffer slices in parallel but in reverse directions. 

[0005] Figure 1 shows a bit-slice for a data buffer chip that in-
terfaces with DDR-2 DRAMs. Data inputs D1, D2 are some
of 25 or so data lines input to a buffer chip. Data input D1
is compared to a reference voltage Vref by differential 
buffer 12, then applied to the D-input of flip-flop 20. 
Likewise, data input D2 is compared to reference voltage 
Vref by differential buffer 14, then applied to the D-input 
of flip-flop 22. Vref is a reference voltage such as Vcc/2. 

[0006] The Q1 output of inverting buffer 16 is a latched data bit
that can be applied to one of the DDR-2 DRAM's data in-
puts. The Q2 output of inverting buffer 18 is another
latched data bit that can be applied to another one of the 
DDR-2 DRAM's data inputs. 

[0007] When SEL is low, mux 24 selects the upper input, causing
Q1 to be driven from the latched D1 from flip-flop 20.
When SEL is high, mux 24 selects its lower input, causing
Q1 to be driven from the latched D2 from flip-flop 22. SEL
can be a mode signal that is low to indicate 1:1 mode, but
high to indicate 1:2 mode. In 1:1 mode, two different out-
puts are generated from two different inputs, but in 1:2
mode two outputs are generated from the same (D2) in-
put.
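
The selection performed by mux 24 can be summarized with a small behavioral sketch. This is only an illustration under stated assumptions: the function name fig1_outputs is hypothetical, the inversion through buffers 16, 18 is ignored, and Q2 is assumed to always follow the latched D2, as the text implies.

```python
def fig1_outputs(d1_latched, d2_latched, sel):
    """Return (Q1, Q2) source data for the Fig. 1 bit slice.

    Hypothetical helper: polarity through inverting buffers 16, 18
    is ignored, so the values are the latched data being selected.
    """
    # Mux 24: SEL low picks the upper input (latched D1),
    # SEL high picks the lower input (latched D2).
    q1 = d2_latched if sel else d1_latched
    # Q2 bypasses mux 24 and is assumed to follow latched D2.
    q2 = d2_latched
    return q1, q2


# SEL low: two different inputs drive the two outputs.
print(fig1_outputs(1, 0, sel=0))
# SEL high: both outputs are driven from the same (D2) input.
print(fig1_outputs(1, 0, sel=1))
```

Only the Q1 path passes through the mux, which is the source of the mismatch in clock-to-output delay discussed next.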

[0008] Clock buffer 26 receives a differential clock CK and CKB,
and generates a clock edge to flip-flops 20, 22 when the
differential clock signals cross over. Reset signal RST can
be applied to differential buffers 12, 14, clock buffer 26, 
and flip-flops 20, 22. 

[0009] While such a data buffer is useful, an added clock-
to-output propagation delay occurs for the Q1 data,
which passes through mux 24, compared with the Q2 data
that does not have to be delayed by mux 24. Mux 24 may
include transmission gates and inverter buffers needed to 
re-generate signals that are reduced in strength by the 
effective resistance of the transmission gates. 

[0010] Since tight delay times are specified by the JEDEC stan-
dard, the data-path delay may have to be reduced, such
as by using a higher-speed buffer 16 or larger-
drive-current transistors in mux 24. However, increasing
the speed of buffer 16 requires a large current, which in- 
creases power consumption. Since there can be as many 
as 25 bit slices such as shown in Fig. 1 in a buffer chip, a 
large overall power consumption can occur. Such large 
power consumptions are undesirable. 
[0011] What is desired is a buffer chip with lower power dissipa-
tion. A faster clock-to-output data output path from the
flip-flop is desirable without relying on large-current dif- 
ferential input buffers. 
Brief Description of Drawings 

[0012] Figure 1 shows a bit-slice for a data buffer chip that in- 
terfaces with DDR-2 DRAMs. 

[0013] Figure 2 shows a 2-bit slice of a buffer chip with a re- 
duced clock-to-output delay by integration of the output 
data mux with the flip-flops. 

[0014] Figure 3 is a schematic of a muxing flip-flop with differ- 
ent inputs. 

[0015] Figure 4 is a schematic of a muxing flip-flop with identical 
inputs. 

[0016] Figure 5 is an alternate embodiment with differential 
clocks. 

[0017] Figure 6 shows a muxing flip-flop that has differential
clocks as inputs.
[0018] Figure 7 is a schematic of a muxing flip-flop that has dif- 
ferential clocks as inputs and the same data input. 
Detailed Description 

[0019] The present invention relates to an improvement in buffer 
chips. The following description is presented to enable 
one of ordinary skill in the art to make and use the inven- 
tion as provided in the context of a particular application 
and its requirements. Various modifications to the pre- 
ferred embodiment will be apparent to those with skill in 
the art, and the general principles defined herein may be 
applied to other embodiments. Therefore, the present in- 
vention is not intended to be limited to the particular em- 
bodiments shown and described, but is to be accorded 
the widest scope consistent with the principles and novel 
features herein disclosed. 

[0020] The inventor has realized that clock-to-output data path 
delays can be reduced if the output mux can be elimi- 
nated. Since the mux is in the critical path, removal of the 
mux can reduce propagation delays and allow for smaller 
buffers to be used for the data output path. The smaller
buffers can result in a significant power reduction since
one buffer is needed for each of the 25 or so data input
slices. The multiplexing function can be integrated with
the flip-flops so that the inputs to the flip-flops are 
muxed rather than their outputs. 

[0021] While the mux could simply be moved to before the flip-
flop inputs, this could cause a different problem. The ad-
ditional mux delay before the flip-flop could cause an in- 
crease in the data set-up time to the flip-flop. Thus sim- 
ply moving the mux may not solve all problems. Rather 
than move the mux, the inventor merges the muxing 
function with the flip-flop itself. 

[0022] Figure 2 shows a 2-bit slice of a buffer chip with a re- 
duced clock-to-output delay by integration of the output 
data mux with the flip-flops. Differential input buffers 12, 
14 compare data inputs D1P, D2P to reference voltage
Vref to generate single-ended data inputs D1, D2 when
reset RST is not active. 

[0023] The single-ended data inputs D1, D2 are input to muxing
flip-flops 40. Muxing flip-flops 40 have an input mux to
the master stage, allowing either one of the two data in-
puts D1, D2 to be latched into the master. The slave stage
does not need muxing logic and can efficiently drive the 
output. 

[0024] The Q1P output is generated by inverting buffer 16, which
directly receives the Q1 output of muxing flip-flops 40,
eliminating the output mux delay of Fig. 1. Likewise, the 
Q2P output is generated by buffer 18, which directly re- 
ceives the Q2 output of muxing flip-flops 40. 

[0025] To implement the desired DDR-2 output-muxing scheme,
the Q1 output of muxing flip-flops 40 can be driven by a
muxed-input flip-flop that receives both D1 and D2, while
the Q2 output is driven by a standard flip-flop with a sin-
gle input, D2. Alternately, the Q2 output can be driven by
a muxed-input flip-flop that has D2 as both inputs to the
mux. Signal delays can be better matched when both Q1
and Q2 data paths are similar.

[0026] Clocking of muxing flip-flops 40 is more complex than that of
a standard flip-flop. Three clocks are used. Clock buffer
30 generates a clock edge of DCK to muxing flip-flops 40
when differential clock signals CK, CKB cross over. DCK is
a free-running clock that clocks the slave stages in mux- 
ing flip-flops 40. 

[0027] While slave-stage clock DCK is always running, only one
of master clocks CK1:2 and CK1:1 is running at a time,
depending on the mode. For 1:1 mode, master-stage
clock CK1:1 pulses, while clock CK1:2 is static, non-
pulsing. For 1:2 mode, master-stage clock CK1:2 pulses,
while clock CK1:1 is static, non-pulsing.

[0028] Clock CK1:1 clocks in data from one of the two inputs to
the muxed-input flip-flop, while clock CK1:2 clocks in 
data from the other of the two inputs to the muxed-input 
flip-flop in muxing flip-flops 40. Thus the mode, 1:1 or 
1:2, determines which of clocks CK1:1 or CK1:2 is puls- 
ing, and which of the two muxed data inputs is active and 
which is disabled. Thus the muxing function is controlled 
by clocks CK1:1, CK1:2. 

[0029] Clock buffers 32, 34 receive the reset signal RST and a
mode select signal SEL, or the inverse of SEL. When SEL is
high, mode 1:2 is selected. The high SEL disables clock 
buffer 34, disabling clock CK1:1, while inverter 10 drives 
a low to the reset input of clock buffer 32, allowing clock 
CK1:2 to pulse. 

[0030] When SEL is low, mode 1:1 is selected. The low SEL en-
ables clock buffer 34, allowing clock CK1:1 to pulse, while
inverter 10 drives a high to the reset input of clock buffer
32, disabling clock CK1:2. When reset RST is active, all
clock buffers 30, 32, 34 are disabled from pulsing. Clock
buffers 30, 32, 34 each generate a clock edge to muxing
flip-flops 40 when differential clock signals CK, CKB
cross over and when its reset input is inactive (low).
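
The gating of the three clocks by SEL and RST can be sketched as a truth-table function. This is a minimal illustration rather than the circuit itself: the function name clock_enables and the dictionary representation are assumptions, and "enabled" here means only that the clock buffer is allowed to pulse when CK, CKB cross over.

```python
def clock_enables(sel, rst):
    """Return which of DCK, CK1:2 and CK1:1 are allowed to pulse.

    Hypothetical helper modeling clock buffers 30, 32, 34 of Fig. 2.
    sel: 1 selects 1:2 mode, 0 selects 1:1 mode.
    rst: 1 means reset RST is active.
    """
    if rst:
        # An active reset disables all three clock buffers.
        return {"DCK": False, "CK1:2": False, "CK1:1": False}
    return {
        "DCK": True,          # slave clock is free-running (buffer 30)
        "CK1:2": bool(sel),   # buffer 32 is released by inverter 10 when SEL is high
        "CK1:1": not sel,     # buffer 34 is enabled when SEL is low
    }


for sel, rst in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(sel, rst, clock_enables(sel, rst))
```

Exactly one master clock is enabled whenever reset is inactive, which is what lets the clocks themselves carry the mux selection.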



[0031] Figure 3 is a schematic of a muxing flip-flop with differ-
ent inputs. Muxing flip-flops 40 of Fig. 2 can include the
muxed-input flip-flop of Fig. 3 that latches in either D1 or D2 and
drives the Q1 output, and the muxed-input flip-flop that
latches in only D2 and drives the Q2 output, shown in Fig.
4.

[0032] All clocks are generated from input clock CK, as shown in 
Fig. 5. When the input clock CK is high, the master stage 
feeds back while the slave stage latches the master data. 
When the input clock CK is low, the slave stage feeds back 
while the master stage samples either input data D1 or
D2. 

[0033] Slave clock DCK is inverted first by inverter 74 to generate
DCKB, then inverted again by inverter 80 to generate 
DCKC. The slave clock pulses in both 1:1 and 1:2 modes. 
First master clock CK1:1 pulses high and low in 1:1 mode 
but is held high (inactive) in 1:2 mode. Inverter 78 inverts 
CK1:1 to generate CK1:1B, and inverter 84 re-inverts 
CK1:1B to generate CK1:1C. Second master clock CK1:2 
pulses high and low in 1:2 mode but is held high (inactive) 
in 1:1 mode. Inverter 76 inverts CK1:2 to generate 
CK1:2B, and inverter 82 re-inverts CK1:2B to generate 
CK1:2C. Inverter 72 inverts reset signal RST to generate
RSTB.

[0034] The slave stage has a first transmission gate of transistors 
60, 62 which are opened when DCK is high and DCKB is 
low. The feedback path is interrupted by transistors 66, 
68 which conduct in the opposite clock state, when DCK is 
low. The output of transmission gate transistors 60, 62 is 
input to NOR gate 70, which drives feedback to p-channel 
transistor 64, which is in series with p-channel transistor 
66, and n-channel transistor 69, which is in series with n-
channel transistor 68. The Q1 output is taken from the
output of transmission gate transistors 60, 62, which is 
also the drains of feedback transistors 66, 68. 

[0035] The master stage has two input transmission gates, and 
two pairs of feedback transistors to perform the muxing 
function. One transmission gate is opened and closed by a 
clock, while the other transmission gate remains closed 
while its feedback transistors remain on. 

[0036] The first transmission gate includes transmission gate
transistors 50, 58 and inputs D1 to the master stage when
clock CK1:1 pulses low. The second transmission gate in-
cludes transmission gate transistors 52, 54 and inputs D2
to the master stage when clock CK1:2 pulses low. Either
clock CK1:1 pulses and clock CK1:2 remains high, when
1:1 mode is selected, or clock CK1:2 pulses and clock
CK1:1 remains high, when 1:2 mode is selected. Thus the
master stage samples only one input, D1 or D2, depend-
ing on the mode selected.

[0037] The master-stage output is taken from inverter 56 and 
drives the transmission gate into the slave stage. Feed- 
back data within the master stage from inverter 56 is ap- 
plied to the gates of p-channel transistor 42 and n- 
channel transistor 49. 

[0038] A feedback gate includes p-channel transistors 42, 44, 45
and n-channel transistors 46, 48, 49 in series. The feed- 
back gate drives the input of inverter 56. When the input 
clock CK is high, both CK1:1C and CK1:2C are high, caus- 
ing n-channel transistors 46, 48 to conduct. Likewise, 
when input clock CK is high, both CK1:1B and CK1:2B are 
low, causing p-channel transistors 44, 45 to conduct. When
the output of inverter 56 is high, n-channel feedback 
transistor 49 is on and p-channel feedback transistor 42 
is off, driving the input of inverter 56 low. When the out- 
put of inverter 56 is low, n-channel feedback transistor 49 
is off and p-channel feedback transistor 42 is on, driving 
the input of inverter 56 high. The data applied to feed- 
back transistors 42, 49 is thus inverted during recycling. 



[0039] The feedback gate stops conducting when input clock CK 
is low, since either CK1:1C or CK1:2C is low, turning off 
n-channel transistor 48 or 46, respectively, and either 
CK1:1B or CK1:2B is high, turning off p-channel transistor 
44 or 45, respectively. 

[0040] In 1:1 mode, data input D1 passes through first transmis-
sion gate transistors 50, 58 when CK1:1B pulses high, and
is inverted by inverter 56 and later latched into the slave
stage when DCK goes low. Output Q1 is thus driven from
input D1. This is the 1:1 mode. CK1:2B remains low in 1:1
mode, so second transmission gate transistors 52, 54 re- 
main off as the primary clock CK pulses. Second feedback 
transistors 45, 46 remain on. Data is fed back from in- 
verter 56 to p-channel transistor 42 and n-channel tran- 
sistor 49, and is recycled to the input of inverter 56 when 
CK1:1B pulses low, and first feedback transistors 44, 48 
turn on. 

[0041] In 1:2 mode, data input D2 passes through second trans-
mission gate transistors 52, 54 when CK1:2B pulses high,
and is inverted by inverter 56 and later latched into the
slave stage when DCK goes low. Output Q1 is driven from
input D2. This is the 1:2 mode. CK1:1B remains low in 1:2
mode, so first transmission gate transistors 50, 58 remain
off as the primary clock CK pulses. First feedback transis-
tors 44, 48 remain on. Data is fed back from inverter 56
to p-channel transistor 42 and n-channel transistor 49, 
and is recycled to the input of inverter 56 when CK1:2B 
pulses low, and second feedback transistors 45, 46 turn 
on. 
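
The two modes of the muxing flip-flop of Fig. 3 can be illustrated with a level-sensitive behavioral sketch. It is an approximation under stated assumptions: the class MuxingFlipFlop and helper clock_cycle are hypothetical names, transistor-level effects and reset are ignored, the master is modeled as sampling while its master clock is low, and the slave is modeled as transparent while DCK is high.

```python
class MuxingFlipFlop:
    """Behavioral sketch of a muxed-input master-slave flip-flop."""

    def __init__(self):
        self.master = 0  # value held at the output of inverter 56
        self.q = 0       # value held at the slave-stage output Q1

    def apply(self, d1, d2, ck11, ck12, dck):
        """Apply one set of static clock levels and return Q1."""
        # Master stage: a low master clock opens that input's
        # transmission gate; inverter 56 inverts the sampled data.
        if ck11 == 0:
            self.master = 1 - d1
        elif ck12 == 0:
            self.master = 1 - d2
        # With both master clocks high, the feedback gate holds the value.

        # Slave stage: passes the master value while DCK is high and
        # holds its previous value while DCK is low.
        if dck == 1:
            self.q = self.master
        return self.q


def clock_cycle(ff, d1, d2, mode_1_2):
    """Drive one low-then-high cycle of input clock CK in the given mode."""
    q = ff.q
    for ck in (0, 1):
        ck11 = 1 if mode_1_2 else ck  # CK1:1 is held high in 1:2 mode
        ck12 = ck if mode_1_2 else 1  # CK1:2 is held high in 1:1 mode
        q = ff.apply(d1, d2, ck11, ck12, dck=ck)
    return q


ff = MuxingFlipFlop()
print(clock_cycle(ff, d1=0, d2=1, mode_1_2=False))  # Q1 follows inverted D1
print(clock_cycle(ff, d1=0, d2=1, mode_1_2=True))   # Q1 follows inverted D2
```

The sketch has no explicit mux on the output path: the choice between D1 and D2 is made entirely by which master clock is allowed to pulse.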

[0042] Figure 4 is a schematic of a muxing flip-flop with identical 
inputs. Both the first transmission gate (transistors 150, 
158) and the second transmission gate (transistors 152, 
154) are connected to data input D2. Thus regardless of 
whether 1:1 or 1:2 mode is selected, data input D2 is 
latched to generate output Q2 from the drains of transis- 
tors 166, 168 in the slave stage. 

[0043] Otherwise, the structure and operation are similar to those
described for Fig. 3. Delays to output Q2 are similar to de-
lays for output Q1 since the structure of the flip-flop of
Fig. 4 is so similar to the structure of the flip-flop of Fig. 
3. Such delay matching may be beneficial in some appli- 
cations. Alternately, a standard flip-flop could be used for 
latching D2 to generate Q2, replacing the circuit of Fig. 4. 

[0044] When reset RST is high, NOR gate 170 drives its output
low. This causes Q2 to go high, which is inverted by in-
verter 18 (Fig. 2) to drive the final Q2 output low during
reset.

[0045] Figure 5 is an alternate embodiment with differential
clocks. Differential clocks may be originally input to the register
chip. Differential signaling can continue into the chip to
the transmission gates of the muxing flip-flops. Fig. 5 is
similar to Fig. 2, except that differential clocks are output
from clock buffers 130, 132, 134 and input to muxing 
flip-flops 140 as differential clocks. 

[0046] For example, clock buffer 130 receives input clock CK, 
CKB and generates DCK, DCKN that are input to muxing 
flip-flops 140. Clock buffer 132 also receives differential 
input clock CK, CKB and generates CK1:2, CK1:2N, while 
clock buffer 134 receives input clock CK, CKB and gener- 
ates CK1:1, CK1:1N. 

[0047] Muxing flip-flops 140 receive a pair of complementary
signals for each clock. While slave clocks DCK, DCKN con-
tinuously pulse in both 1:1 and 1:2 modes, master clocks
CK1:1, CK1:1N pulse only in 1:1 mode, while CK1:1 stays 
high and CK1:1N stays low in 1:2 mode. Similarly, master 
clocks CK1:2, CK1:2N pulse only in 1:2 mode, while CK1:2 
stays high and CK1:2N stays low in 1:1 mode. 
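
The clock behavior in each mode can be tabulated with a short sketch. The function name differential_clock_states and the string labels "pulsing", "high" and "low" are illustrative assumptions, not signal names from the figures.

```python
def differential_clock_states(mode):
    """Return the state of each differential clock for mode "1:1" or "1:2"."""
    if mode not in ("1:1", "1:2"):
        raise ValueError("mode must be '1:1' or '1:2'")
    # Slave clocks pulse continuously in both modes.
    states = {"DCK": "pulsing", "DCKN": "pulsing"}
    active, idle = ("CK1:1", "CK1:2") if mode == "1:1" else ("CK1:2", "CK1:1")
    # The selected master clock pair pulses.
    states[active] = "pulsing"
    states[active + "N"] = "pulsing"
    # The unselected pair is static: true side high, complement low.
    states[idle] = "high"
    states[idle + "N"] = "low"
    return states


for mode in ("1:1", "1:2"):
    print(mode, differential_clock_states(mode))
```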

[0048] Figure 6 shows a muxing flip-flop that has differential
clocks as inputs. Inverter 274 generates DCKB from posi-
tive differential clock DCK, while inverter 276 generates
DCKNB from negative differential clock DCKN. DCKNB is 
applied to the gates of n-channel transmission gate tran- 
sistor 260 and p-channel feedback transistor 266 in the 
slave stage, while DCKB is applied to the gates of p- 
channel transmission gate transistor 262 and n-channel 
feedback transistor 268 in the slave stage. The slave's in- 
put transmission gate is thus turned on when DCK and 
DCKNB are high, and DCKB is low. 

[0049] Inverter 282 generates CK1:1B from positive differential
clock CK1:1, while inverter 284 generates CK1:1NB from
negative differential clock CK1:1N. CK1:1B is applied to
the gates of n-channel transmission gate transistor 250
and p-channel feedback transistor 244 in the master
stage to sample D1, while CK1:1NB is applied to the gates
of p-channel transmission gate transistor 258 and n-
channel feedback transistor 248 in the master stage. The
master's D1 input transmission gate is thus turned on
when CK1:1 and CK1:1NB are low, and CK1:1B is high.

[0050] Inverter 278 generates CK1:2B from positive differential
clock CK1:2, while inverter 280 generates CK1:2NB from
negative differential clock CK1:2N. CK1:2B is applied to
the gates of n-channel transmission gate transistor 252
and p-channel feedback transistor 245 in the master
stage to sample D2, while CK1:2NB is applied to the gates 
of p-channel transmission gate transistor 254 and n- 
channel feedback transistor 246 in the master stage. The 
master's D2 input transmission gate is thus turned on 
when CK1:2 and CK1:2NB are low, and CK1:2B is high. 

[0051] Operation is otherwise similar to that described earlier for
the circuit of Fig. 3. Figure 7 is a schematic of a muxing
flip-flop that has differential clocks as inputs and the 
same data input. Data input D2 is applied to both the first 
transmission gate of transistors 350, 358, and to the sec- 
ond transmission gate of transistors 352, 354. Thus in 
both 1:1 and 1:2 modes, data D2 is sampled to generate 
Q2 from the slave stage. 

[0052] Having differential clocks propagated through the muxing
flip-flops can improve performance. For the worst-case 
clocks, one less inverter delay is needed when differential 
internal clocks are used, as in Figs. 5-7, compared with 
the single-ended clocks of Figs. 2-4. DCKC requires the 
clock buffer 30 plus two inverters 74, 80 (Figs. 2, 3), while
DCKNB requires clock buffer 130 plus only one inverter
276 (Figs. 5, 6). Thus one inverter delay is saved using full
differential clocking. 
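
The saving can be expressed as simple arithmetic on the worst-case clock path. The delay values below are hypothetical placeholders chosen only for illustration; the specification gives no delay figures, and the function name worst_case_clock_delay is an assumption.

```python
def worst_case_clock_delay(t_buffer, t_inverter, differential):
    """Worst-case internal clock delay: clock buffer plus inverter stages.

    Single-ended clocking (DCKC) passes through two inverters after the
    clock buffer; differential clocking (DCKNB) passes through only one.
    """
    inverters = 1 if differential else 2
    return t_buffer + inverters * t_inverter


# Hypothetical delays in picoseconds, not taken from the specification.
T_BUFFER_PS = 100
T_INVERTER_PS = 20

single_ended = worst_case_clock_delay(T_BUFFER_PS, T_INVERTER_PS, differential=False)
full_diff = worst_case_clock_delay(T_BUFFER_PS, T_INVERTER_PS, differential=True)
print(single_ended - full_diff)  # the saving equals one inverter delay
```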



[0053] ALTERNATE EMBODIMENTS 

[0054] Several other embodiments are contemplated by the in- 
ventor. For example, different buffering, gating, and logic 
may be substituted. Inverters, NAND or NOR gates could 
be added to the clock or data buffers, or these gates can 
be replaced with other logic such as transmission gates 
and buffers or switch networks. Signals can be active high 
or active low. 

[0055] Clocks could be free-running and yet still be disabled for
power-saving or other disabling modes. Clocks could be 
free-running for shorter periods of time, such as when 
data is being transferred to or from the DRAM chips, while 
the clocks are disabled for other periods of time when the 
DRAM is not being accessed. The reset signals could be 
activated for these non-access time periods, or other dis- 
abling or power-down signals could be used. Global or 
local or some combination of buffering and inverting can 
be used. 

[0056] The master and slave stages could be set or reset by
adding various logic gates. For example, the slave can be
set to 1 by a NOR gate 70 and an inverter, using an ac-
tive-high reset to the NOR gate, or reset to 0 using a
NAND gate with an active-low reset that replaces the
NOR gate. The master stage could be set or reset in a
similar manner by changing inverter 56 to a NAND or NOR 
gate. Keeper or leaker transistors could be added, as 
could capacitors and resistors or other passive compo- 
nents. Inverters and buffers could be added to the output, 
and multiple outputs or differential outputs could be gen- 
erated. The muxing flip-flops 40 invert the data input, but 
a non-inverting flip-flop could be constructed by taking 
the output Q1 from the output of NOR gate 70 rather than
from the input of NOR gate 70. 
[0057] Any advantages and benefits described may not apply to
all embodiments of the invention. When the word "means" 
is recited in a claim element, Applicant intends for the 
claim element to fall under 35 USC Sect. 112, paragraph 
6. Often a label of one or more words precedes the word 
"means". The word or words preceding the word "means" 
is a label intended to ease referencing of claims elements 
and is not intended to convey a structural limitation. Such 
means-plus-function claims are intended to cover not 
only the structures described herein for performing the 
function and their structural equivalents, but also equiva- 
lent structures. For example, although a nail and a screw 
have different structures, they are equivalent structures
since they both perform the function of fastening. Claims
that do not use the word "means" are not intended to fall 
under 35 USC Sect. 112, paragraph 6. Signals are typically 
electronic signals, but may be optical signals such as can 
be carried over a fiber optic line. 
[0058] The foregoing description of the embodiments of the in- 
vention has been presented for the purposes of illustra- 
tion and description. It is not intended to be exhaustive or 
to limit the invention to the precise form disclosed. Many 
modifications and variations are possible in light of the 
above teaching. It is intended that the scope of the inven- 
tion be limited not by this detailed description, but rather 
by the claims appended hereto. 



